Overview

Dataset Statistics

Number of Variables 5
Number of Rows 1.1508e+06
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 146
Duplicate Rows (%) 0.0%
Total Size in Memory 132.8 MB
Average Row Size in Memory 121.0 B
Variable Types
  • Numerical: 4
  • Categorical: 1
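
The overview figures above can be approximated with plain pandas. The following is a minimal sketch on a tiny made-up frame: the `overview` helper and the toy values are hypothetical, and it assumes that duplicate rows are counted the way `DataFrame.duplicated()` counts them and that memory is measured as in `memory_usage(deep=True)`. The profiling tool may account for either differently.

```python
import pandas as pd


def overview(df: pd.DataFrame) -> dict:
    """Compute overview statistics similar to those listed in the report."""
    n_rows, n_cols = df.shape
    n_cells = n_rows * n_cols
    missing = int(df.isna().sum().sum())
    # duplicated() flags every repeat of an earlier row, not the first copy.
    duplicates = int(df.duplicated().sum())
    total_bytes = int(df.memory_usage(deep=True).sum())
    return {
        "variables": n_cols,
        "rows": n_rows,
        "missing_cells": missing,
        "missing_cells_pct": 100.0 * missing / n_cells if n_cells else 0.0,
        "duplicate_rows": duplicates,
        "duplicate_rows_pct": 100.0 * duplicates / n_rows if n_rows else 0.0,
        "total_bytes": total_bytes,
        "avg_row_bytes": total_bytes / n_rows if n_rows else 0.0,
    }


# Hypothetical toy data using the same five column names as the report.
toy = pd.DataFrame(
    {
        "session_id": [11, 11, 12, 12],
        "timestamp": ["2014-01-01T00:00:00.000Z"] * 2
        + ["2014-01-02T00:00:00.000Z"] * 2,
        "item_id": [214507331, 214507331, 214507332, 214507333],
        "price": [0, 0, 1046, 0],
        "quantity": [0, 0, 1, 0],
    }
)
stats = overview(toy)
print(stats)
```

In the toy frame the second row repeats the first, so one duplicate row is reported out of four.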

Dataset Insights

  • item_id is skewed
  • price is skewed
  • quantity is skewed
  • timestamp has high cardinality: 1136477 distinct values
  • timestamp has constant length 24
  • price has 610030 (53.01%) zeros
  • quantity has 610030 (53.01%) zeros
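
price and quantity report the same number of zeros (610030), which suggests, but does not prove, that the zeros fall on the same rows. A minimal sketch of how that could be checked on the underlying data is shown below; the helper names and the toy frames are hypothetical.

```python
import pandas as pd


def zero_share(series: pd.Series) -> float:
    """Percentage of entries that are exactly zero."""
    return float((series == 0).mean() * 100)


def zeros_coincide(df: pd.DataFrame, a: str = "price", b: str = "quantity") -> bool:
    """True when column a is zero on exactly the rows where column b is zero."""
    return bool(((df[a] == 0) == (df[b] == 0)).all())


# Hypothetical toy frames: one where the zeros line up, one where they do not.
aligned = pd.DataFrame({"price": [0, 0, 1046, 523], "quantity": [0, 0, 1, 2]})
misaligned = pd.DataFrame({"price": [0, 5], "quantity": [1, 0]})

print(zero_share(aligned["price"]))   # share of zero prices in the toy frame
print(zeros_coincide(aligned))
print(zeros_coincide(misaligned))
```

Equal zero counts are only a necessary condition, so a row-level comparison like this is needed before treating the two columns as jointly "empty".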

Variables


session_id

numerical

Approximate Distinct Count 509696
Approximate Unique (%) 44.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 18412048 B
Mean 5.9149e+06
Minimum 11
Maximum 11562121
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • session_id is skewed left (γ1 = -0.0583)

Quantile Statistics

Minimum 11
5-th Percentile 516331
Q1 2.9014e+06
Median 5.9394e+06
Q3 8.7601e+06
95-th Percentile 1.0978e+07
Maximum 11562121
Range 11562110
IQR 5.8588e+06

Descriptive Statistics

Mean 5.9149e+06
Standard Deviation 3.3474e+06
Variance 1.1205e+13
Sum 6.8066e+12
Skewness -0.05825
Kurtosis -1.1918
Coefficient of Variation 0.5659
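
The descriptive statistics above are linked by simple identities, which makes for a quick sanity check. The sketch below uses only the rounded figures printed in the report, so the comparisons use loose tolerances. The last check is an interpretation rather than something the report states: a kurtosis near -1.2 and a standard deviation near range / sqrt(12) are what a roughly uniform spread of IDs would produce.

```python
import math

# Figures copied from the session_id block and the overview (rounded as printed).
mean = 5.9149e6
std = 3.3474e6
variance = 1.1205e13
total = 6.8066e12
cv = 0.5659
n_rows = 1.1508e6
value_range = 11562110

checks = {
    # Variance is the square of the standard deviation.
    "variance == std**2": math.isclose(std**2, variance, rel_tol=1e-3),
    # Coefficient of variation is std divided by mean.
    "cv == std / mean": math.isclose(std / mean, cv, rel_tol=1e-3),
    # Sum is mean times the number of rows.
    "sum == mean * n_rows": math.isclose(mean * n_rows, total, rel_tol=1e-3),
    # A uniform distribution over the observed range has std = range / sqrt(12).
    "std ~ range / sqrt(12)": math.isclose(
        value_range / math.sqrt(12), std, rel_tol=1e-2
    ),
}

for name, ok in checks.items():
    print(f"{name}: {ok}")
```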

timestamp

categorical

Approximate Distinct Count 1136477
Approximate Unique (%) 98.8%
Missing 0
Missing (%) 0.0%
Memory Size 102417017 B

Length

Mean 24
Standard Deviation 0
Median 24
Minimum 24
Maximum 24

Sample

1st row 2014-04-06T18:44:5...
2nd row 2014-04-06T18:44:5...
3rd row 2014-04-06T09:40:1...
4th row 2014-04-04T06:13:2...
5th row 2014-04-04T06:13:2...
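
The column is profiled as categorical because it is stored as text. For time-based analysis it would usually be parsed first. The sketch below assumes the values are ISO 8601 strings with milliseconds and a trailing `Z`, which is inferred from the constant length of 24 rather than shown in the truncated samples; the three timestamps are hypothetical.

```python
import pandas as pd

# Hypothetical values in the assumed 24-character layout.
raw = pd.Series(
    [
        "2014-01-01T00:00:00.000Z",
        "2014-01-01T00:00:01.500Z",
        "2014-01-02T12:30:00.000Z",
    ],
    name="timestamp",
)

# utc=True yields timezone-aware timestamps; the trailing "Z" denotes UTC.
parsed = pd.to_datetime(raw, utc=True)

# Example of a question the text form cannot answer directly: events per day.
per_day = parsed.dt.floor("D").value_counts().sort_index()
print(per_day)
```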

Letter

Count 2301506
Lowercase Letter 0
Space Separator 0
Uppercase Letter 2301506
Dash Punctuation 2301506
Decimal Number 19562801
  • timestamp contains many words: 1136477 words
  • timestamp has words of constant length
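
The character counts above can be cross-checked against the row count. The sketch below classifies the characters of one hypothetical timestamp by Unicode general category and divides the report's totals by the per-value counts. It assumes every value follows the same ISO 8601 layout with milliseconds and a trailing `Z`; the samples in the report are truncated, so that layout is an inference from the constant length of 24.

```python
import unicodedata
from collections import Counter

# Hypothetical timestamp in the assumed layout (not taken from the data).
sample = "2014-01-01T00:00:00.000Z"


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (Lu, Nd, Pd, Po, ...)."""
    return Counter(unicodedata.category(ch) for ch in text)


counts = category_counts(sample)
# Lu = uppercase letter (T, Z), Pd = dash punctuation (-),
# Nd = decimal number, Po = other punctuation (: and .)
print(counts)

# Totals as printed in the report.
uppercase_total = 2301506
dash_total = 2301506
digit_total = 19562801

# Dividing each total by the per-value count gives the implied number of rows.
rows_from_uppercase = uppercase_total // counts["Lu"]
rows_from_dashes = dash_total // counts["Pd"]
rows_from_digits = digit_total // counts["Nd"]
print(rows_from_uppercase, rows_from_dashes, rows_from_digits)
```

All three divisions are exact and agree on 1150753 rows, which matches the rounded "Number of Rows 1.1508e+06" in the overview.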

item_id

numerical

Approximate Distinct Count 19949
Approximate Unique (%) 1.7%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 18412048 B
Mean 2.2045e+08
Minimum 214507331
Maximum 1178837797
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • item_id is skewed right (γ1 = 8.5249)

Quantile Statistics

Minimum 214507331
5-th Percentile 2.1455e+08
Q1 2.1472e+08
Median 2.1483e+08
Q3 2.1485e+08
95-th Percentile 2.1485e+08
Maximum 1178837797
Range 964330466
IQR 133093

Descriptive Statistics

Mean 2.2045e+08
Standard Deviation 4.8973e+07
Variance 2.3984e+15
Sum 2.5369e+14
Skewness 8.5249
Kurtosis 70.795
Coefficient of Variation 0.2221
  • item_id is not normally distributed (p-value 4.268515703601139e-25)
  • item_id has 31742 outliers
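
The item_id mean (2.2045e+08) lies above the 95-th percentile (2.1485e+08), which is what a small group of very large IDs would cause. The sketch below illustrates the effect on a hypothetical six-value series: only the smallest and largest values are the report's minimum and maximum, the rest are made up. Note that pandas `skew()` and `kurt()` apply bias corrections, so they may not match the profiling tool's γ1 and kurtosis exactly.

```python
import pandas as pd

# Hypothetical series: a tight cluster plus one very large ID.
ids = pd.Series(
    [214507331, 214550000, 214720000, 214830000, 214850000, 1178837797],
    name="item_id",
)

summary = {
    "mean": float(ids.mean()),
    "median": float(ids.median()),
    "skewness": float(ids.skew()),   # bias-corrected sample skewness
    "kurtosis": float(ids.kurt()),   # bias-corrected excess kurtosis
}
print(summary)
```

Because item_id is an identifier rather than a measurement, skewness and outlier flags here describe how IDs were assigned, not a property of the items themselves.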

price

numerical

Approximate Distinct Count 735
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 18412048 B
Mean 1423.5265
Minimum 0
Maximum 334998
Zeros 610030
Zeros (%) 53.0%
Negatives 0
Negatives (%) 0.0%
  • price is skewed right (γ1 = 12.9938)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 0
Q3 1046
95-th Percentile 6282
Maximum 334998
Range 334998
IQR 1046

Descriptive Statistics

Mean 1423.5265
Standard Deviation 4651.5488
Variance 2.1637e+07
Sum 1.6381e+09
Skewness 12.9938
Kurtosis 357.527
Coefficient of Variation 3.2676
  • price is not normally distributed (p-value 4.437403332664008e-25)
  • price has 143247 outliers
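
The report does not say how outliers are defined. A common choice, assumed here purely for illustration, is the box-plot rule that flags values more than 1.5 × IQR beyond the quartiles. The sketch below applies that rule to the quartiles printed in the report; `iqr_fences` is a hypothetical helper.

```python
def iqr_fences(q1: float, q3: float, k: float = 1.5) -> tuple:
    """Return (lower, upper) fences for the k * IQR outlier rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles as printed in the report.
price_low, price_high = iqr_fences(q1=0, q3=1046)
qty_low, qty_high = iqr_fences(q1=0, q3=1)

print(price_low, price_high)
print(qty_low, qty_high)
```

Under this assumption any price above 2615 and any quantity above 2.5 would be flagged. Since more than half of the values are zero, the quartiles sit very low and ordinary non-zero values can end up beyond the upper fence.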

quantity

numerical

Approximate Distinct Count 28
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 18412048 B
Mean 0.6461
Minimum 0
Maximum 30
Zeros 610030
Zeros (%) 53.0%
Negatives 0
Negatives (%) 0.0%
  • quantity is skewed right (γ1 = 8.63)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 0
Q3 1
95-th Percentile 2
Maximum 30
Range 30
IQR 1

Descriptive Statistics

Mean 0.6461
Standard Deviation 1.1445
Variance 1.3099
Sum 743486
Skewness 8.63
Kurtosis 152.1296
Coefficient of Variation 1.7715
  • quantity is not normally distributed (p-value 2.6223748373203384e-19)
  • quantity has 30172 outliers

Interactions

Correlations

Missing Values